*! version 5.0
* 13 August 2018
* NIDS

* THIS IS A FOOD AND NON-FOOD EXPENDITURE DO FILE: 6 OF 14

*=====================================================================================================================================
* GLOBALS FOR DATA FILES, DO FILES AND VERSION SUFFIXES

* DEFINED IN "W1 Food_NonFood Expenditure - Master  Food_NonFood Expenditure do file  (1 of 14).do"

*=====================================================================================================================================
* SETTING UP STATA TO RUN DO FILES

clear
cap clear matrix
set more off 

**********************************************************************
*					Food 3
***			Aggregate Imputations and Overview
**********************************************************************

**********************************************************************
***			Preparation
**********************************************************************

use "$DataOUT/tempdata4.dta", clear

egen countNoFood = rowtotal(e1_2_*)
egen psutotal=total(totalfood), by(w1_cluster)
egen psucount=count(totalfood), by(w1_cluster)
*egen psutotalimp=total(aggfoodimp), by(w1_cluster)
*egen psucountimp=count(aggfoodimp), by(w1_cluster)

*********************************************************************
***		Imputing Missing Aggregates: Regression Method
**********************************************************************

*Sums the regression-imputed item totals (rowtotal returns 0 when every item is missing, so 0 is reset to missing) and, where the sum is missing, falls back on e1_1 (the one-shot response on food consumption) if this was answered
egen aggfoodimp =rowtotal(imptotal*)
replace aggfoodimp=. if aggfoodimp==0
replace aggfoodimp=e1_1 if aggfoodimp==.&e1_1!=.
gen lgaggfoodimp=log(aggfoodimp)
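
* Illustrative check (an addition, not part of the original NIDS procedure):
* log() returns missing for zero or negative arguments, so a household whose
* aggregate is non-positive would have a missing lgaggfoodimp while aggfoodimp
* itself is non-missing. Whether such values can occur here depends on how
* e1_1 was cleaned upstream, which is an assumption this check does not settle.
quietly count if aggfoodimp<=0 & !missing(aggfoodimp)
display as text "Households with a non-positive food aggregate: " as result r(N)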

*Uses regression imputation to replace the missing aggregate values with the exponentiated prediction from a regression of the log aggregate on household characteristics
gen foodagg2=aggfoodimp
reg lgaggfoodimp lgincome w1_h_dwlrms westerncape easterncape northerncape freestate kwazulunatal northwest gauteng mpumalanga  ///
employ fridge urban hhsizer maxage fammatric  Asian White Coloured anychildren

impute lgaggfoodimp lgincome w1_h_dwlrms westerncape easterncape northerncape freestate kwazulunatal northwest gauteng mpumalanga  ///
employ fridge urban hhsizer maxage fammatric  Asian White Coloured anychildren, gen(foodagglg)

replace foodagg2 =exp(foodagglg) if foodagg2==.
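
* Illustrative sketch (an addition, not part of the original NIDS procedure):
* -impute- is understood to base each prediction on whichever covariates are
* observed for a household, whereas a plain regress/predict only yields a
* prediction where every covariate is observed. The counts below show how many
* households with a missing aggregate fall into each case.
* `xbhat' is a hypothetical temporary variable introduced only for this check.
tempvar xbhat
quietly regress lgaggfoodimp lgincome w1_h_dwlrms westerncape easterncape northerncape freestate kwazulunatal northwest gauteng mpumalanga  ///
    employ fridge urban hhsizer maxage fammatric  Asian White Coloured anychildren
quietly predict double `xbhat', xb
quietly count if missing(aggfoodimp) & !missing(`xbhat')
display as text "Missing aggregates with a complete-covariate prediction: " as result r(N)
quietly count if missing(aggfoodimp) & missing(`xbhat')
display as text "Missing aggregates with at least one covariate missing: " as result r(N)
quietly count if missing(foodagg2)
display as text "Aggregates still missing after regression imputation: " as result r(N)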

**********************************************************************
***		Imputing Missing Aggregates: Cell Median Method
**********************************************************************

*Sums the cell-median-imputed item totals and, where the sum is missing, falls back on e1_1 (the one-shot response on food consumption) if this was answered
egen fmedimpagg = rowtotal(fmedianimp*)
replace fmedimpagg=. if fmedimpagg==0
replace fmedimpagg=e1_1 if fmedimpagg==.&e1_1!=.

*Generates PSU, district and provincial medians of the current aggregate and fills the remaining missing values from each level in turn (PSU first, then district, then province)
egen fmedimpaggmed = median(fmedimpagg), by(w1_cluster)
replace fmedimpagg =fmedimpaggmed if fmedimpagg ==.
egen fmedimpaggmeddis = median(fmedimpagg), by(w1_dc2011)
replace fmedimpagg =fmedimpaggmeddis if fmedimpagg ==.
egen fmedimpaggmedprov = median(fmedimpagg), by(province)
replace fmedimpagg =fmedimpaggmedprov if fmedimpagg ==.
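
* Illustrative check (an addition, not part of the original NIDS procedure):
* after the PSU, district and province fallbacks, a value can remain missing
* only if the provincial median is itself undefined, for example where the
* province identifier is missing. The count below reports any such households.
quietly count if missing(fmedimpagg)
display as text "Aggregates still missing after cell median imputation: " as result r(N)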

**********************************************************************
***			Summaries and Rates of Imputation
**********************************************************************

replace totalfood=. if totalfood==0
sum foodagg2 fmedimpagg totalfood, detail 

gen foodimputed = 1 if foodagg2!=.&aggfoodimp==.
replace foodimputed=0 if foodagg2==.|aggfoodimp!=.

gen foodimputedpartial=1 if foodimputed==1|foodsubimp>0
replace foodimputedpartial=0 if foodimputed==0&foodsubimp==0

sum foodagg2 fmedimpagg totalfood if foodimputed ==1, detail
sum foodagg2 fmedimpagg totalfood if foodimputedpartial ==1, detail
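
* Illustrative summary (an addition, not part of the original NIDS procedure):
* both flags are 0/1 dummies, so their means are the shares of households
* with a fully or partially imputed food aggregate.
tabulate foodimputed foodimputedpartial, missing
quietly summarize foodimputed, meanonly
display as text "Share with a fully imputed aggregate: " as result %5.3f r(mean)
quietly summarize foodimputedpartial, meanonly
display as text "Share with any imputed component: " as result %5.3f r(mean)
* Stata treats missing as larger than any number, so foodsubimp>0 is also true
* when foodsubimp is missing. The count below shows whether that case arises.
quietly count if missing(foodsubimp)
display as text "Households with a missing foodsubimp: " as result r(N)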

**********************************************************************
***			Cleaning Up, Dropping, Labelling
**********************************************************************

*Food Medians and counts
drop psumean* psucount* psutot* dismedian* f*counter psufrate* disfsize* disfcount*  
drop psumedian* psufsize* disfrate* psu* diffsign 

*Food Regression Imputation Data
drop total*lg total*lgimp

**Imputation Rate Data and analysis of imputed data
drop small* fmedianimp*meds imptotal*imps noImputed* ftotalmeds ftotalimps impcount*

label var totalfood "Raw Aggregate of all food items (no imputation)"
forvalues i=1/32{
    label var total`i' "Raw total for Food Item `i'"
}

forvalues i=1/32{
    rename imptotal`i' freg`i'
    label var freg`i' "Consumption for Food Item `i' using Regression Imputation"
}
*Food Median Imputation Data
drop provmedian* 

forvalues i=1/32{
    rename fmedianimp`i' fmed`i'
    label var fmed`i' "Consumption for Food Item `i' using Cell Median Imputation"
}

forvalues a=1/32{
    rename total`a'imputed f`a'imputed
    label var f`a'imputed "Dummy Variable for Food Item `a' being an imputed value"
}

rename countNoFood fooditemcount
label var fooditemcount "Number of Food Items Consumed by this Household"

**Aggregate Data
drop aggfoodimp lgaggfoodimp foodagglg
rename foodagg2 fregagg
label var fregagg "Aggregate Food Consumption using Regression Imputation"

rename foodsubimp fooditemimps
label var fooditemimps "Number of Food Items Imputed for this Household"

rename fmedimpagg fmedagg
label var fmedagg "Aggregate Food Consumption using Cell Median Imputation"
drop  fmedimpaggmed fmedimpaggmedprov

***Overall Imputation Rates
label var foodimputed "Dummy Variable for Aggregate Food value being totally imputed"
label var foodimputedpartial "Dummy Variable for Aggregate Food value with one or more components imputed"

*---------------------------------------------------------------------------------------------------------------------------------
save "$DataOUT/tempdata5.dta", replace
